Abstract

A range of applications in cognitive radio networks, from adaptive spectrum sensing to predictive 
spectrum mobility and dynamic spectrum access, depend on our ability to foresee the state evolution 
of radio spectrum, raising a fundamental question: To what degree is radio spectrum state (RSS) 
predictable? In this paper, we explore the fundamental limits of predictability in RSS dynamics by
studying the RSS evolution patterns in the spectrum bands of several popular services, including TV
bands, ISM bands, and cellular bands. From an information theory perspective, we introduce
a methodology that uses statistical entropy measures and Fano’s inequality to quantify the degree of
predictability underlying real-world spectrum measurements. Despite the apparent randomness, we find
a remarkable predictability, as large as 90%, in the real-world RSS dynamics over a number of spectrum
bands for all popular services. Furthermore, we discuss the potential applications of prediction-based
spectrum sharing in 5G wireless communications.

I. Introduction 

During the past decades, we have witnessed a dramatic growth in wireless access along with the 
popularity of smart phones, mobile TVs, and many other wireless services. The ever-increasing 
demand for high data rates in the face of limited radio spectrum resources has motivated
the introduction of cognitive radio (CR), which opens a potential communication paradigm to 
improve spectrum utilization by allowing secondary users to opportunistically access spectrum 
holes or white spaces unused by primary users. To enable CR, one fundamental challenge is 
how to reliably identify when and where spectrum holes exist. 

Spectrum sensing and spectrum prediction are known as two effective enabling techniques 
to identify spectrum holes. Briefly, spectrum sensing determines radio spectrum state (RSS) 
using various signal detection methods, which has been investigated extensively in the literature 
(see, e.g., a survey in [1]). Complementarily, spectrum prediction infers unknown/unmeasured 
RSS from historical known/measured spectrum data by exploiting the inherent correlation and/or 
regularity among them, which has gained increasing attention recently (see, e.g., a survey in [2]). 
Spectrum prediction has many merits, e.g., reducing the sensing time and energy consumption
involved in adaptive spectrum sensing [3] and increasing the system throughput via
prediction-based dynamic spectrum access [4].

To reap these benefits, a number of spectrum prediction techniques have been proposed, such
as time series-based prediction, autoregressive model-based prediction, hidden Markov model-based
prediction, neural network-based prediction, and Bayesian inference-based prediction
(see, e.g., the survey in [2] and the references therein). However, it is still unclear what
upper-bound performance these prediction techniques can achieve in the various frequency
bands. Moreover, the RSS evolution patterns are generally determined by human usage
of the radio spectrum. Although we rarely consider human activity in the radio domain to be totally
random, current models of RSS evolution are fundamentally stochastic; see, e.g., the most widely
used continuous/discrete-time Markov chain models in [5]. The probabilistic nature of the
existing modeling framework raises fundamental questions: What is the role of randomness in
RSS evolution, and to what degree are RSS dynamics predictable?

This paper studies, from a theoretical standpoint, the interplay between the regular (and thus
predictable) and the random (and thus unforeseeable) components underlying real-world RSS dynamics,
and provides guidance on how to apply the predicted RSS to the design of future wireless
communication systems. Specifically, from an information theory perspective, we
introduce a methodology that uses statistical entropy measures and Fano’s inequality to quantify
the degree of predictability underlying real-world spectrum measurements and provide some
intuitive thoughts and conclusions. After establishing the fundamental limits of predictability
in RSS dynamics, this paper moves forward by addressing the potential applications of the
predicted RSS in 5G wireless communications.

Fig. 1. The 3-D view of the evolution trajectories of one week of real-world RSS in the TV bands (614~698 MHz). For each
200 kHz spectrum band, about 3360 samples per week, and thus 480 samples per day, are plotted.

II. Spectrum Data Description and Preprocessing 

In order to ensure that other researchers can reproduce the spectrum prediction analysis in this
paper, we use a well-known open-source real-world spectrum dataset from the
RWTH Aachen University spectrum measurement campaign [5]. In this paper, we are primarily
interested in several popular services, including TV bands, ISM bands, and cellular bands.
The resolution bandwidth of each individual spectrum band is 200 kHz. The inter-sample time
is about 3 minutes, which corresponds to 3360 samples per week for each 200 kHz spectrum
band.¹

¹ The original inter-sample time in [5] is about 1.8 seconds, which results in 48000 samples per day and 336000 samples per
week for each individual spectrum band. To facilitate the presentation and analysis, a preprocessing procedure is
performed in this paper to obtain a new spectrum dataset by averaging every 100 consecutive samples (100 × 1.8 s = 3 minutes).
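
The preprocessing step can be sketched as a simple block average. This is a minimal illustration, not the authors’ code: the block length of 100 is inferred from the sample counts quoted above (336000 / 3360 = 100), the function name `block_average` is hypothetical, the input is synthetic noise rather than the RWTH Aachen data, and averaging the values directly as given (rather than in the linear power domain) is an assumption.

```python
import numpy as np

def block_average(psd, block=100):
    """Average consecutive, non-overlapping blocks of `block` samples."""
    psd = np.asarray(psd, dtype=float)
    n_blocks = psd.size // block            # an incomplete trailing block is dropped
    return psd[: n_blocks * block].reshape(n_blocks, block).mean(axis=1)

# Synthetic stand-in for one week of raw samples of one band (not real measurements).
rng = np.random.default_rng(0)
raw_week = rng.normal(loc=-105.0, scale=3.0, size=336_000)

week = block_average(raw_week)
print(week.size)         # number of averaged samples per week
print(week.size // 7)    # number of averaged samples per day
```

With 336000 raw samples and a block length of 100, the sketch yields the 3360 samples per week and 480 samples per day stated in the text.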
As an illustrative example, Fig. 7 shows the evolution trajectories of one week of real-world RSS,
i.e., measured power spectral density (PSD) values, in TV bands. Several interesting phenomena
can be observed. First, the RSS dynamics differ significantly across frequency bands:
several bands are heavily loaded while others are not. Moreover, randomness and regularity
coexist in the RSS evolution. Very strong signals can be identified in several TV bands, and
the temporal variations of the signals in these bands appear less significant than those in
other bands.

To further show the spectrum utilization of each 200 kHz spectrum band, Fig. 8 plots the
duty cycle over frequency under two well-known detection thresholds. One threshold, -107
dBm/200 kHz, was initially proposed in the IEEE 802.22 working group for the detection
of wireless microphones in 200 kHz channels in the TV bands; the other, more sensitive
threshold, -114 dBm/200 kHz, is specified in the FCC’s final rules [5]. As shown in Fig.
8, the binary spectrum occupancy (BSO) is highly dependent on the selection of the specific
detection threshold.
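
To make the threshold dependence concrete, the following sketch computes the duty cycle as the fraction of samples whose PSD exceeds a detection threshold. It is an illustration under stated assumptions: the helper name `duty_cycle`, the strict "greater than" comparison, and the toy PSD values are hypothetical and are not taken from the dataset.

```python
import numpy as np

def duty_cycle(psd_dbm, threshold_dbm):
    """Fraction of time samples above the threshold, computed per band (row)."""
    psd_dbm = np.asarray(psd_dbm, dtype=float)
    occupied = psd_dbm > threshold_dbm      # binary spectrum occupancy (BSO)
    return occupied.mean(axis=-1)

# Toy PSD matrix in dBm/200 kHz: rows are bands, columns are time samples.
psd = np.array([
    [-120.0, -118.0, -119.0, -121.0],   # idle band
    [-110.0, -112.0, -109.0, -116.0],   # weak signal
    [ -90.0,  -92.0,  -91.0,  -89.0],   # strong signal
])

for threshold in (-107.0, -114.0):
    print(threshold, duty_cycle(psd, threshold))
```

In this toy example the weak-signal band appears idle at -107 dBm/200 kHz but is occupied 75% of the time at -114 dBm/200 kHz, which is the kind of threshold sensitivity the duty-cycle comparison describes.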



Fig. 2. Impact of the detection threshold on the duty cycle in TV bands (614~698 MHz).


Until now, to characterize the RSS dynamics, most existing studies have focused on
BSO traces by analyzing the ON and OFF state evolution over time. In this paper, we
instead investigate the continuously measured PSD traces and analyze the predictability of the PSD
evolution over time, mainly for the following reason: the PSD is the original raw data, while
the BSO, obtained from the PSD by comparison with a detection threshold, inevitably introduces
detection or sensing errors (e.g., false alarms and missed detections) [6].

August 24, 2015

DRAFT

III. Spectrum Prediction Analysis: To What Degree is Radio Spectrum State Predictable?

In this section, we first perform the prediction analysis on each individual 200 kHz spectrum
band separately, and then, statistically, on the full set of spectrum bands allocated to each service.

A. Entropy Analysis 

For a given spectrum band, let $X_i$ be a random variable representing its state at time slot $i$.
The state of this band from time slot 1 to time slot $n$ is a series of random variables $X_1, X_2, \ldots, X_n$.
Entropy is probably the most fundamental quantity characterizing the degree of predictability of
a random variable series. In general, lower entropy implies higher predictability, and vice versa.
Recently, entropy-based analyses have been introduced in various prediction scenarios
such as atmosphere [7], network traffic [8], and human mobility [9]. The basic idea is that
entropy offers a precise definition of the informational content of predictions, and it is renowned
for its generality because it makes minimal assumptions about the model of the studied scenario.

Specifically, to facilitate the following entropy analysis of RSS dynamics, we first quantize the
PSD values of each individual spectrum band into $Q$ RSS levels. Then, let $S = \{X_1, X_2, \ldots, X_n\}$
denote the sequence of RSS levels that occurred at $n$ consecutive time slots. We use
the following three entropy measures to characterize the RSS dynamics:

• Random entropy $E^{\mathrm{rand}} = \log_2 Q$, capturing the degree of predictability of the given spectrum
band’s evolution if each RSS level occurs with equal probability in each time slot.

• Temporal-uncorrelated entropy $E^{\mathrm{unc}} = -\sum_{i=1}^{Q} p_i \log_2 p_i$, where $p_i$ is the probability that
the $i$-th RSS level occurred in the sequence $S$. $E^{\mathrm{unc}}$, also known as Shannon entropy or
classical information theoretical entropy, is by far the most often used entropy metric, which
characterizes the heterogeneity of the RSS evolution patterns without taking into account
the history of the process.

• Actual entropy $E^{\mathrm{actual}} = -\sum_{S_i \subset S} P(S_i) \log_2 [P(S_i)]$, where $P(S_i)$ is the probability of a
particular time-ordered subsequence $S_i$ occurring in the trajectory of $S$. Thus, $E^{\mathrm{actual}}$ depends
not only on the occurrence frequency of each RSS level, but also on the temporal order in
which the RSS levels occurred, and it captures the full frequency-time structure present
in a given spectrum band’s evolution pattern. In practice, to calculate the actual entropy
from the historical spectrum measurements, we use an estimator based on Lempel-Ziv data
compression [10], which is known to rapidly converge to the actual entropy of a time series.
For a time series with length $n$, the entropy is estimated by $\left( \frac{1}{n} \sum_{i=1}^{n} \Lambda_i \right)^{-1} \ln n$,
where $\Lambda_i$ is the length of the shortest subsequence starting at the $i$-th time slot which does not
previously appear from time slot 1 to time slot $i-1$.

Fig. 3. The entropy of the RSS dynamics in TV bands. The number of RSS levels is set as Q = 8 for each individual 200
kHz spectrum band and thus the random entropy is $\log_2 Q = \log_2 8 = 3$ bits.
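
The three entropy measures described above can be sketched in a few lines. This is an illustrative sketch rather than the authors’ implementation, and it rests on several assumptions: uniform quantization between the minimum and maximum PSD value (the quantizer is not specified in the text), a base-2 logarithm in the Lempel-Ziv estimate so that the result is in bits and comparable with $\log_2 Q$, and the convention that a match running to the end of the series counts one symbol beyond the series end. All function names are hypothetical.

```python
import math
from collections import Counter

def quantize(psd, q):
    """Map PSD values to integer levels 0..q-1 using uniform bins (an assumption)."""
    lo, hi = min(psd), max(psd)
    if hi == lo:
        return [0] * len(psd)
    width = (hi - lo) / q
    return [min(int((x - lo) / width), q - 1) for x in psd]

def random_entropy(q):
    """Entropy if all q levels were equally likely, in bits."""
    return math.log2(q)

def uncorrelated_entropy(levels):
    """Shannon entropy of the level frequencies, ignoring temporal order."""
    n = len(levels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(levels).values())

def lz_entropy(levels):
    """Lempel-Ziv estimate: (mean of Lambda_i)^-1 * log2(n), in bits."""
    s = "".join(chr(ord("A") + v) for v in levels)   # one character per level
    n = len(s)
    total = 0
    for i in range(n):
        k = 1
        # Grow the substring starting at i until it no longer occurs in s[:i].
        while i + k <= n and s[i:i + k] in s[:i]:
            k += 1
        total += k                                   # Lambda_i
    return n / total * math.log2(n)

# A perfectly periodic toy sequence with four equally frequent levels.
levels = [0, 1, 2, 3] * 50
print(random_entropy(8), uncorrelated_entropy(levels), lz_entropy(levels))
```

For the periodic toy sequence, the temporal-uncorrelated entropy is 2 bits because the four levels are equally frequent, while the Lempel-Ziv estimate is much lower because the temporal order is fully regular. The substring search is quadratic or worse in the series length, which is acceptable for a sketch but not for long series.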

Intuitively, we have $0 \le E^{\mathrm{actual}} \le E^{\mathrm{unc}} \le E^{\mathrm{rand}}$, which is illustrated in Fig. 9 via analyzing
the real-world spectrum measurements in TV bands. At one extreme, if a spectrum band has actual
entropy $E^{\mathrm{actual}} = 0$, its evolution is completely regular and thus fully predictable. If,
however, a spectrum band’s actual entropy $E^{\mathrm{actual}} = E^{\mathrm{rand}} = \log_2 Q$, its trajectory is expected to
follow a quite random pattern and thus we cannot predict it with an accuracy exceeding $1/Q$. As
shown in Fig. 9, all spectrum bands have finite actual entropies between 0 and $E^{\mathrm{rand}}$, indicating
that not only does a certain amount of randomness govern their future states, but also that
there is some regularity in their dynamics that can be exploited for predictive purposes.

Based on the obtained actual entropy, in the following, we aim to quantify the limits of the
predictability of a spectrum band’s next state based on its trajectory history. That is, we want
to answer the question: How predictable is a spectrum band’s next state given the entropy of its
historical trajectory?

B. Predictability Analysis 

An important measure of predictability is the probability $\Pi$ that an appropriate predictive
algorithm can correctly predict a spectrum band’s future state. This quantity is subject to Fano’s
inequality [11]. That is, if an individual spectrum band with an actual entropy $E^{\mathrm{actual}}$ evolves
among $Q$ RSS levels, its predictability $\Pi \le \Pi^{\max}$, where $\Pi^{\max}$ is determined by

$E^{\mathrm{actual}} = -\left[ \Pi^{\max} \log_2 \Pi^{\max} + (1 - \Pi^{\max}) \log_2 (1 - \Pi^{\max}) \right] + (1 - \Pi^{\max}) \log_2 (Q - 1)$.

Based on this relationship, for each spectrum band, we can obtain the upper-bound predictability,
$\Pi^{\max}$, through numerical calculations given $Q$ and $E^{\mathrm{actual}}$.
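
The numerical calculation can be sketched with a bisection search. The text does not say which solver the authors used, so bisection is an assumption here, as are the function names. The sketch relies on the fact that the right-hand side of the Fano relationship decreases monotonically from $\log_2 Q$ to 0 as the predictability goes from $1/Q$ to 1, and it assumes $Q \ge 2$.

```python
import math

def fano_rhs(p, q):
    """Entropy implied by predictability p: H(p) + (1 - p) * log2(q - 1)."""
    h = 0.0
    for x in (p, 1.0 - p):
        if x > 0.0:                      # treat 0 * log2(0) as 0
            h -= x * math.log2(x)
    return h + (1.0 - p) * math.log2(q - 1)

def max_predictability(entropy, q, tol=1e-10):
    """Solve fano_rhs(p, q) = entropy for p in [1/q, 1] by bisection."""
    entropy = min(max(entropy, 0.0), math.log2(q))   # clip to the valid range
    lo, hi = 1.0 / q, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if fano_rhs(mid, q) > entropy:
            lo = mid      # implied entropy too high, so the root lies above mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Limiting cases for Q = 8: zero entropy and maximal entropy.
print(max_predictability(0.0, 8))    # approaches 1
print(max_predictability(3.0, 8))    # approaches 1/8
```

The two limiting cases match the discussion of the entropy bounds: zero actual entropy gives full predictability, and an actual entropy equal to $\log_2 Q$ gives a predictability of $1/Q$.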

As an illustrative example, Fig. 10 shows the upper-bound predictability $\Pi^{\max}$ of each 200
kHz TV band separately when the number of RSS levels is set as Q = 8. For comparison,
the predictability of independent and identically distributed (i.i.d.) Gaussian noise data with one-week
samples is also plotted. We have the following observations:

• The predictability of real-world RSS data varies significantly across spectrum bands.
For example, there are a number of TV bands with predictability higher than 0.95,
which means that at most 5% of the time these spectrum bands change their states in a
manner that appears to be random, and in the remaining 95% of the time we can expect to
predict their states. On the other hand, there are also a few TV bands
with predictability lower than 0.9, which means that no matter how good our predictive
algorithms are, we cannot predict the future states of these spectrum bands with better
than 90% accuracy.

• For all TV bands, the predictability of real-world RSS data is much higher than that of
the i.i.d. Gaussian noise data. This demonstrates that the temporal correlation or regularity
in the real-world RSS data benefits the predictability.

Fig. 4. The predictability in RSS dynamics for TV bands.

Furthermore, from a statistical perspective, Fig. 11 shows the cumulative distribution functions
(CDFs) of the predictability (with Q = 8) of various services, including TV bands (614~698
MHz), ISM bands (2400.1~2483.3 MHz), cellular bands (GSM1800 uplink 1710.2~1784.8
MHz and GSM1800 downlink 1820.2~1875.4 MHz), and 2.3 GHz bands (2300~2400 MHz).²

We have the following observations: 

• Among all services, TV bands have the steepest CDF, with a minimum predictability of
0.8836. Comparatively, most bands in the 2.3 GHz range have relatively low predictability, with the
minimum close to the predictability of i.i.d. Gaussian noise data, 0.7623. ISM bands have a
CDF between those of TV bands and cellular bands, which implies that a larger (smaller) proportion
of ISM bands have higher predictability than cellular (TV) bands.

• The GSM 1800 downlink is more predictable than the GSM 1800
uplink for spectrum bands with predictability levels in the bottom 70 percent. However, for
spectrum bands with predictability levels in the top 30 percent, the GSM 1800 uplink is
more predictable than the GSM 1800 downlink. That is, although the majority
of the GSM 1800 downlink bands have superior predictability, some GSM 1800
uplink bands have very high predictability levels. These somewhat conflicting observations
might result from the fact that quite regular patterns of human spectrum usage exist in a few
GSM 1800 uplink bands.
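
A per-service CDF of this kind can be built from the per-band predictability values as an empirical CDF. The sketch below is only an illustration: the function name `empirical_cdf` is hypothetical, and the predictability values are made-up placeholders, not results from the paper.

```python
import numpy as np

def empirical_cdf(values):
    """Return sorted values x and the fraction of bands with predictability <= x."""
    x = np.sort(np.asarray(values, dtype=float))
    f = np.arange(1, x.size + 1) / x.size
    return x, f

# Hypothetical per-band upper-bound predictability values for one service.
pi_max = [0.90, 0.80, 0.95, 0.85]

x, f = empirical_cdf(pi_max)
for xi, fi in zip(x, f):
    print(f"{xi:.2f} {fi:.2f}")

# Share of bands whose predictability exceeds a chosen level.
print(np.mean(np.asarray(pi_max) > 0.88))
```

Comparing two services then amounts to comparing their curves: a curve that rises later (further to the right) indicates that a larger share of that service’s bands has high predictability.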


² The predictability results for the 2.3 GHz bands are included in Fig. 5, since Europe is currently looking at deploying
licensed/authorized shared access within these bands.

Fig. 5. The cumulative distribution functions (CDFs) for the predictability of various services.


IV. Applications: 5G Spectrum Sharing 

Radio spectrum usage is an essential issue in 5G wireless communications [12]. The explosion
of data rates offered by the mobile Internet and the Internet of Things (IoT) is overwhelming the allocated
2G/3G/4G radio spectrum. In the past, new cellular spectrum has typically been made available
through spectrum refarming. However, clearing radio spectrum from an allocated but underutilized
usage to repurpose the spectrum band for another usage often requires many years to
accomplish, which makes it difficult to keep pace with user demand for gigabit-per-second (Gbps)
data rates in 5G [13]. On the other hand, technological innovations such as millimeter wave
communications and visible light communications can offer very high data rates; however, these
disruptive technologies are mainly for small cells and low-mobility usage. To provide wide-area
cell types, spectrum resources below 3 GHz will be needed [14].

To address these challenges, spectrum sharing is contemplated as the primary candidate, which
has been well recognized as an affordable, near-term solution for meeting the 5G radio spectrum
requirements and increasing radio access network capacities for 5G content delivery. Specifically,
5G spectrum sharing goes well beyond the previous studies on cognitive radio-based spectrum
sharing, since one main feature of the latter is the opportunistic primary-secondary access in
unlicensed bands (such as TV white space). In contrast, as shown in Fig. 12, 5G spectrum sharing
may occur in both licensed bands (e.g., GSM1800 bands, 2.3 GHz bands) and unlicensed bands
(e.g., ISM bands, TV bands). Moreover, one distinguishing feature of potential 5G spectrum
usage is its diversity, i.e., besides the licensed exclusive access in traditional cellular networks,
licensed/authorized shared access, unlicensed shared access (known as LTE in unlicensed bands),
and primary-secondary access will coexist [15].

Fig. 6. A vision for potential 5G spectrum usage paradigms.

Spectrum prediction will play a significant role in 5G spectrum sharing. Several potential 
applications are described below: 

• Cost-efficient wideband carrier aggregation. To meet the 5G capacity requirement, it is
known that no single band or air interface standard by itself fully suffices, and it is inevitable
for 5G devices to aggregate the benefits of multiple (non-contiguous) spectrum bands over
a very wide range, possibly from bands at several hundred MHz to 30-300 GHz millimeter
wave bands. Consequently, proactive schemes are expected to exploit the evolution dynamics
of various spectrum bands over such a wide range, and enable wideband carrier selection and
aggregation in a timely and cost-efficient manner.

• Dynamic frequency selection and predictive interference mitigation. One dominant theme
of the wireless evolution into 5G is network densification, which is realized mainly by increasing
the density of infrastructure nodes (such as base stations and relays) in a given
geographic area. It is anticipated that hyper-dense small cells will be largely privately owned
and deployed without planning. The small cells thus need to be capable of configuring,
optimizing, and healing themselves to select the communication frequency bands without
causing any noticeable interference to the existing neighboring networks. The knowledge
from spectrum prediction can be used by the small cells to assist such autonomous processes
through dynamic frequency selection and predictive interference mitigation.

V. Conclusion and Discussions 

Predicting the radio spectrum state evolution is gaining increasing attention with the explosively
growing demand for dynamic spectrum access. In this paper, statistical entropy measures and
Fano’s inequality are exploited to quantify the degree of predictability underlying real-world spectrum
measurements. The results in this paper, serving as the upper-bound prediction performance,
can provide a performance bound for various predictive algorithms and general guidance for the
design of future wireless communication systems. Notably, it remains a challenge for
state-of-the-art prediction techniques to obtain a prediction precision approaching the upper-bound
predictability. Further improvement of the forecast accuracy of spectrum prediction techniques
in a real-time mode is thus required.

Fig. 7. The 3-D view of the evolution trajectories of one week of real-world RSS in the TV bands (614~698 MHz). For each
200 kHz spectrum band, about 3360 samples per week, and thus 480 samples per day, are plotted.



Fig. 8. Impact of the detection threshold on the duty cycle in TV bands (614~698 MHz).





Fig. 9. The entropy of the RSS dynamics in TV bands. The number of RSS levels is set as Q = 8 for each individual 200
kHz spectrum band and thus the random entropy is $\log_2 Q = \log_2 8 = 3$ bits.



Fig. 10. The predictability in RSS dynamics for TV bands. 

Fig. 11. The cumulative distribution functions (CDFs) for the predictability of various services.

Fig. 12. A vision for potential 5G spectrum usage paradigms.